NSF PAR Search | NSF Public Access Repository

The Law of Knowledge Overshadowing: Towards Understanding, Predicting and Preventing LLM Hallucination

https://doi.org/10.18653/v1/2025.findings-acl.1199

Zhang, Yuji; Li, Sha; Qian, Cheng; Liu, Jiateng; Yu, Pengfei; Han, Chi; Fung, Yi R; McKeown, Kathleen; Zhai, ChengXiang; Li, Manling; et al (January 2025, Association for Computational Linguistics)

Full Text Available
Bridging Text Data and Graph Data: Towards Semantics and Structure-aware Knowledge Discovery

https://doi.org/10.1145/3616855.3636450

Jin, Bowen; Zhang, Yu; Li, Sha; Han, Jiawei (March 2024, ACM)

Graphs and texts are two key modalities in data mining. In many cases, the data presents a mixture of the two modalities and the information is often complementary: in e-commerce data, the product-user graph and product descriptions capture different aspects of product features; in scientific literature, the citation graph, author metadata, and the paper content all contribute to modeling the paper impact.
Full Text Available
Text2DB: Integration-Aware Information Extraction with Large Language Model Agents

https://doi.org/10.18653/v1/2024.findings-acl.12

Jiao, Yizhu; Li, Sha; Zhou, Sizhe; Ji, Heng; Han, Jiawei (January 2024, Association for Computational Linguistics)

Full Text Available
MACAROON: Training Vision-Language Models To Be Your Engaged Partners

https://doi.org/10.18653/v1/2024.findings-emnlp.454

Wu, Shujin; Fung, Yi; Li, Sha; Wan, Yixin; Chang, Kai-Wei; Ji, Heng (January 2024, Association for Computational Linguistics)

Full Text Available
Sub-two-cycle gigawatt-peak-power LWIR OPA for ultrafast nonlinear spectroscopy of condensed state materials

https://doi.org/10.1364/OL.500550

Leshchenko, Vyacheslav; Li, Sha; Agostini, Pierre; DiMauro, Louis F. (September 2023, Optics Letters)

The application of high-power, few-cycle, long-wave infrared (LWIR, 8–20 µm) pulses in strong-field physics is largely unexplored due to the lack of suitable sources. However, the generation of intense pulses with >6 µm wavelength range is becoming increasingly feasible with the recent advances in high-power ultrashort lasers in the middle-infrared range that can serve as a pump for optical parametric amplifiers (OPA). Here we experimentally demonstrate the feasibility of this approach by building an OPA pumped at 2.4 µm that generates 93 µJ pulses at 9.5 µm, 1 kHz repetition rate with sub-two-cycle pulse duration, 1.6 GW peak power, and excellent beam quality. The results open a wide range of applications in attosecond physics (especially for studies of condensed phase samples), remote sensing, and biophotonics.
Full Text Available
High-order harmonic generation from a thin film crystal perturbed by a quasi-static terahertz field

https://doi.org/10.1038/s41467-023-38187-0

Li, Sha; Tang, Yaguo; Ortmann, Lisa; Talbert, Bradford K; Blaga, Cosmin I; Lai, Yu Hang; Wang, Zhou; Cheng, Yang; Yang, Fengyuan; Landsman, Alexandra S; et al (December 2023, Nature Communications)

Studies of laser-driven strong field processes subjected to a (quasi-)static field have been mainly confined to theory. Here we provide an experimental realization by introducing a bichromatic approach for high harmonic generation (HHG) in a dielectric that combines an intense 70 femtosecond duration mid-infrared driving field with a weak 2 picosecond period terahertz (THz) dressing field. We address the physics underlying the THz field induced static symmetry breaking and its consequences on the efficient production/suppression of even-/odd-order harmonics, and demonstrate the ability to probe the HHG dynamics via the modulation of the harmonic distribution. Moreover, we report a delay-dependent even-order harmonic frequency shift that is proportional to the time derivative of the THz field. This suggests a limitation of the static symmetry breaking interpretation and implies that the resultant attosecond bursts are aperiodic, thus providing a frequency domain probe of attosecond transients while opening opportunities in precise attosecond pulse shaping.
Full Text Available
Open-Domain Hierarchical Event Schema Induction by Incremental Prompting and Verification

https://doi.org/10.18653/v1/2023.acl-long.312

Li, Sha; Zhao, Ruining; Li, Manling; Ji, Heng; Callison-Burch, Chris; Han, Jiawei (January 2023, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics)

Event schemas are a form of world knowledge about the typical progression of events. Recent methods for event schema induction use information extraction systems to construct a large number of event graph instances from documents, and then learn to generalize the schema from such instances. In contrast, we propose to treat event schemas as a form of commonsense knowledge that can be derived from large language models (LLMs). This new paradigm greatly simplifies the schema induction process and allows us to handle both hierarchical relations and temporal relations between events in a straightforward way. Since event schemas have complex graph structures, we design an incremental prompting and verification method INCPROMPT to break down the construction of a complex event graph into three stages: event skeleton construction, event expansion, and event-event relation verification. Compared to directly using LLMs to generate a linearized graph, INCPROMPT can generate large and complex schemas with 7.2% F1 improvement in temporal relations and 31.0% F1 improvement in hierarchical relations. In addition, compared to the previous state-of-the-art closed-domain schema induction model, human assessors were able to cover ∼10% more events when translating the schemas into coherent stories and rated our schemas 1.3 points higher (on a 5-point scale) in terms of readability.
Full Text Available
Instruct and Extract: Instruction Tuning for On-Demand Information Extraction

https://doi.org/10.18653/v1/2023.emnlp-main.620

Jiao, Yizhu; Zhong, Ming; Li, Sha; Zhao, Ruining; Ouyang, Siru; Ji, Heng; Han, Jiawei (January 2023, Association for Computational Linguistics)

Full Text Available
A multifunctional Wnt regulator underlies the evolution of rodent stripe patterns

https://doi.org/10.1038/s41559-023-02213-7

Johnson, Matthew R.; Li, Sha; Guerrero-Juarez, Christian F.; Miller, Pearson; Brack, Benjamin J.; Mereby, Sarah A.; Moreno, Jorge A.; Feigin, Charles Y.; Gaska, Jenna; Rivera-Perez, Jaime A.; et al (December 2023, Nature Ecology & Evolution)

Full Text Available
Eider: Empowering Document-level Relation Extraction with Efficient Evidence Extraction and Inference-stage Fusion

https://doi.org/10.18653/v1/2022.findings-acl.23

Xie, Yiqing; Shen, Jiaming; Li, Sha; Mao, Yuning; Han, Jiawei (January 2022, Association for Computational Linguistics)

Full Text Available
